Abstract

We propose a new search strategy for high-multiplicity hadronic final states. When new 
particles are produced at threshold, the distribution of their decay products is approximately 
isotropic. If there are many partons in the final state, it is likely that several will be clustered 
into the same large-radius jet. The resulting jet exhibits substructure, even though the 
parent states are not boosted. This "accidental" substructure is a powerful discriminant
against background because it is more pronounced for high-multiplicity signals than for 
QCD multijets. We demonstrate how to take advantage of accidental substructure to reduce 
backgrounds without relying on the presence of missing energy. As an example, we present 
the expected limits for several R-parity violating gluino decay topologies. This approach
allows for the determination of QCD backgrounds using data-driven methods, which is crucial 
for the feasibility of any search that targets signatures with many jets and suppressed missing 
energy. 
I. INTRODUCTION

Our approach to jet physics is undergoing a renaissance. While most LHC studies use the energy 
and momentum of a jet, there is growing appreciation for the wealth of information that can be 
extracted by analyzing a jet's internal structure (see [1-3] for reviews). Jet substructure gained
traction when it was shown to increase the LHC sensitivity to Higgs boson decays into b-quarks [4].
Since then, jet substructure has been applied by theorists to a variety of scenarios [5-27], and its
power has been demonstrated experimentally in Tevatron [28, 29] and LHC [30-36] searches.

In all existing studies, jet substructure has been used to search for boosted resonances with 
collimated decay products that are reconstructed as a single jet. For a typical event at the LHC, 
parent particles are produced near threshold; the decay products are boosted for the small fraction 
of signal events produced with significant transverse momentum, 1 or in the case where the parent 
particle decays to significantly lighter daughters. In this paper, we explore a new application for 
jet substructure techniques that does not rely on having collimated decay products. 

We demonstrate that substructure technology is useful in the non-boosted regime for models 
that yield a high multiplicity of hadronic final states. This strategy relies on the fact that when new 
particles with O(TeV) masses are produced at threshold, their decay products tend to be distributed 
isotropically in the detector. Our proposal requires an event to contain several (specifically, four or 
more) large-radius jets defined using the anti-$k_T$ algorithm [39] with angular size $R = 1.2$. Because
these so-called "fat" jets can cover a large fraction of the effective detector area, several decay
partons from a high-multiplicity signal will often get clustered into a single fat jet. Non-boosted
final states can therefore manifest "accidental substructure." 

Requiring multiple fat jets with non-trivial substructure greatly reduces QCD contamination. 
For an event to have four fat jets, it must have at least this many well-separated hard partons. 
The presence of substructure in the remaining QCD sample is most likely to occur when one or
more isolated partons undergo a hard $1 \to 2$ splitting. Because this process is dominated by a
soft and/or collinear singularity, the probability decreases as the energy and separation of the final 
states increases. As a result, QCD events typically have suppressed substructure. 

Figure 1 illustrates why accidental substructure is useful for distinguishing between a typical



1 For example, the signal efficiency when targeting boosted gluinos is roughly O(few %) at the LHC [37, 38].
signal and background event. These "lego plots" show the spatial distribution of calorimeter
activity in the $\eta$-$\phi$ plane, where $\eta$ is pseudorapidity and $\phi$ is azimuthal angle. The left panel
is a lego plot for a signal event with up to 18 partons in the final state; the signal is gluino pair
production, where each gluino decays to a pair of top quarks and an unstable neutralino that
decays to three partons (see the left diagram in Fig. 2). The right panel shows the lego plot for a
QCD event. The different colors correspond to different fat jets in the event. It is clear that the
fat jets from signal have more pronounced substructure than the ones from QCD.

Figure 1 suggests that cutting on the number of small-radius ($R \sim 0.4$) jets may suffice to
distinguish signal from background. An explicit high jet multiplicity search requires accurate 
modeling of the QCD background, which has intrinsic theoretical challenges. The current state 



[Figure 1 panels: $\tilde{g} \to t\bar{t} + 3j$ (left), QCD (right)]

FIG. 1: Lego plots showing the distribution of calorimeter activity in the $\eta$-$\phi$ plane. The different colors
correspond to different fat jets; within each panel, darker colors signify higher $p_T$ in a given detector cell.
Note that the relative $p_T$ scale is different for the signal and background example. The signal (left panel) is
pair production of 500 GeV gluinos with $\tilde{g} \to t\bar{t} + 3j$, which yields up to 18 partons in the final state. The
gluinos have transverse momenta of 120 and 65 GeV, so they are essentially at rest. A QCD multijet event
is depicted in the right panel. The circles are centered on the clustered fat jet with a radius of $R = 1.2$
to schematically illustrate the extent of each fat jet. There is significant substructure for the signal and
suppressed substructure for the background.
of the art is tree-level QCD calculations that rely on matrix element-parton shower matching up
to six jets. Because additional jets must be generated by the parton shower, these calculations
systematically underestimate the $p_T$ spectrum of the high multiplicity tail. Higher multiplicity,
matched, next-to-leading order calculations are not anticipated in the near future, implying that
precision modifications to the shapes of the QCD distribution will not be known. Finally, even
once this has been achieved, there is the computational limitation associated with populating the
entire $3n$-dimensional phase space for events with $n$ jets. As a result, theorists should validate
Monte Carlo background predictions against data to derive plausible limits. There exist studies
from the CMS and ATLAS collaborations that present 6 jet [40] and 8 jet [41] distributions.
However, these do not provide enough information to place cuts on the number of small-radius
jets larger than $\sim 6$-$8$. This constrains theoretical investigations of high multiplicity searches with
small-radius jets.

An experimental analysis targeting many small-radius jets must obtain the multijet backgrounds 
from data. Current data-driven methods for determining detailed kinematic features of small-radius 
jets are limited in that they rely on ad hoc fitting functions to perform background extrapolations. 
If a search that utilized these procedures yields an excess of events, there is no guidance for 
investigating the discrepancy because the functions are not derived from an underlying theory. 2 

Searches that use fat jets can implement an alternate strategy to estimate backgrounds. For the 
substructure analysis proposed here, one can study the internal structure of fat dijets. Because this 
sample should be signal poor, it can be used to determine the pure QCD dependence of jet mass 
and substructure on other quantities like jet $p_T$. These results can then be extrapolated to four fat
jet events, and should lead to reasonable background predictions so long as the correlations between 
fat jets are small. Importantly, the associated systematics for a fat jet analysis differ from those 
that dominate in a search for many small-radius jets. It is beneficial to have competing searches 
with different systematics to ensure that new physics is not overwhelmed by large uncertainties. 

Finally, we note that our analysis does not rely on the presence of missing transverse energy
($\not{E}_T$), which is typically crucial for discriminating against multijet backgrounds in searches for
supersymmetry (SUSY). Missing energy is not a robust prediction of SUSY models, e.g. R-parity



2 For recent theoretical progress on extrapolating jet multiplicity, see [42].
[Figure 2 panels, left to right: $\tilde{g} \to t\bar{t} + 3j$, $\tilde{g} \to t + 2j$, $\tilde{g} \to 3j$]

FIG. 2: Gluino decay diagrams, illustrating topologies that can lead to as many as 18, 10, and 6-parton
final states (left to right, respectively) when the gluinos are pair-produced. Note that $\tilde{g}$ is a gluino, $\tilde{t}$ is a
stop, $t$ is a top quark, $\tilde{q}$ is a first or second generation squark, $\tilde{\chi}$ is a neutralino, and $j$ refers to a final state
quark or anti-quark.

can be violated, the superpartner spectrum can be squeezed, or SUSY can be stealthy [43, 44].
There are also a number of non-SUSY models that have signatures without $\not{E}_T$, such as [45-48]. To
cover these and other $\not{E}_T$-less theories at the LHC, it is imperative to develop new search strategies
to efficiently reduce the QCD background. Such a strategy could rely on rare objects in the event,
such as b-jets or leptons, to further reduce backgrounds. However, a search that is independent
of these extra handles is powerful for its generality. Because our proposal only relies on having a
final state with many jets, it can be used to place limits on a wide range of model space.

We demonstrate that accidental substructure is a powerful discriminator by applying it to
three distinct gluino ($\tilde{g}$) decay scenarios when the R-parity violating (RPV) superpotential coupling
$U^c D^c D^c$ is non-zero:

$$\tilde{g} \to t\bar{t} + 3j, \qquad \tilde{g} \to t + 2j, \qquad \text{and} \qquad \tilde{g} \to 3j. \qquad (1)$$

Here $j$ refers to a final state quark or anti-quark, not to a detector-level jet. When the gluinos
are pair-produced, these three topologies can lead to as many as 18, 10, and 6-parton final states,
respectively, as shown in Fig. 2. The first topology arises when a gluino decays to a pair of tops and
an unstable neutralino, which decays to three partons through an off-shell squark via $U^c D^c D^c$.






The other two topologies correspond to the RPV gluino decays into $tbs$ and $uds$ final states. For
a review of constraints on these RPV interactions, see [49]. The 18 and 10-parton topologies are
particularly well-motivated theoretically because the top quarks in the final state can result from a
light stop in the spectrum. This is a plausible scenario with minimal fine-tuning where the non-zero
RPV couplings suppress $\not{E}_T$, thereby hiding SUSY from current searches [50]. In particular, the
10-parton topology was the focus of a recent proposal that used substructure techniques to look
for boosted stops [26].

The remainder of this paper proceeds as follows. In Sec. II, we present the needed variables,
jet mass and $N$-subjettiness, and introduce the concept of "event-subjettiness." In Sec. III, we
show how these tools can be combined into a full analysis. After a brief description of the event
generation procedure, we present the expected limits for the different gluino decay topologies.
We conclude in Sec. IV. Appendix A contains a detailed description of our simulations, including
validation plots.

II. QUANTIFYING ACCIDENTAL SUBSTRUCTURE 

Our analysis relies on two observables: total jet mass and event-subjettiness. The latter is a
new variable that we introduce to quantify the amount of accidental substructure in an event. It
requires $N$-subjettiness to characterize the subjet nature of each jet. Jet mass, $N$-subjettiness,
and event-subjettiness form the cornerstone of our analysis, so we introduce them individually
here. The full analysis strategy is presented in Sec. III, and the details of our Monte Carlo event
generation, detector mock-up, and validation can be found in Sec. III A and Appendix A.

For the figures in this section, we select 8 TeV LHC events with at least four jets, clustered using
the anti-$k_T$ algorithm [39] with cone size $R = 1.2$. The transverse momenta of the leading and
subleading fat jets must satisfy $p_T > 100$ GeV and $p_T > 50$ GeV, respectively. Although no 8 TeV
multijet, $\not{E}_T$-less triggers are publicly available, we anticipate that they will not be significantly
more restrictive than existing 7 TeV examples: five or more jets ($R = 0.4$) with $p_T > 30$ GeV at
ATLAS [37], $\sim 500$-$750$ GeV of $H_T$ at CMS [H], and 4, 6, or 8 high-$p_T$ jets ($R = 0.5$) at CMS [EE].
We have verified that the first of these triggers is 100% efficient for the QCD background and the
gluino topologies we consider after final selection cuts.
A. Jet Mass

Standard SUSY searches at ATLAS and CMS use a combination of missing energy, $\not{E}_T$, and
visible transverse energy,

$$H_T = \sum_{j=1}^{N_j} \sqrt{(p_T^2)_j + m_j^2}, \qquad (2)$$

where $j$ is a jet in the event with mass $m_j = \sqrt{E_j^2 - |\vec{p}_j|^2}$ and $N_j$ is the number of jets in the event
with $p_T > 50$ GeV. The total jet mass of an event,

$$M_J = \sum_{j=1}^{N_j} m_j, \qquad (3)$$

is a more powerful discriminator than $H_T$ in searches for high multiplicity final states [52] because
a jet's mass automatically encodes gross kinematic features of its constituents.

Consider a small-radius jet that is seeded from an isolated parton. In the absence of showering,
this jet will have zero mass. Non-zero jet mass arises if multiple partons are clustered together
and/or from QCD radiation; the former yields a larger jet mass than the latter. As a result, a
QCD and signal event with equivalent $H_T$ can have different total jet mass. More quantitatively,
$H_T$ can be related to $M_J$ via

$$H_T = \sum_{j=1}^{N_j} \sqrt{(p_T^2)_j + m_j^2} \simeq \sum_{j=1}^{N_j} m_j \sqrt{(\kappa R)^{-2} + 1} \simeq M_J \, \frac{\sqrt{1 + \kappa^2 R^2}}{\kappa R}, \qquad (4)$$

where the second step takes $m_j \simeq \kappa R \, (p_T)_j$, with $\kappa \sim \sqrt{\alpha_s}$ for jets whose mass is generated
from the parton shower pQ and $\kappa \sim 1$ for fat
jets that contain multiple hard partons accidentally clustered in the same jet. Figure 3 shows the
$H_T$ and $M_J$ distributions for background and a signal example. A cut on $M_J$ improves
sensitivity to the signal more than an $H_T$ requirement does.
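
As an illustration of Eqs. (2) and (3), the following Python sketch computes $H_T$ and $M_J$ from jet four-vectors. The function names, the `(E, px, py, pz)` tuple layout, and the toy jets are assumptions made for this example; the toy numbers are not taken from the simulation. The two toy events are tuned to share the same $H_T$ while differing in total jet mass.

```python
import math

def jet_mass(E, px, py, pz):
    """Invariant mass m_j = sqrt(E^2 - |p|^2); clipped at zero against rounding."""
    return math.sqrt(max(E * E - (px * px + py * py + pz * pz), 0.0))

def ht_and_total_jet_mass(jets, pt_min=50.0):
    """Return (H_T, M_J) summed over jets with p_T > pt_min (GeV)."""
    ht, mj_total = 0.0, 0.0
    for E, px, py, pz in jets:
        pt = math.hypot(px, py)
        if pt <= pt_min:
            continue
        m = jet_mass(E, px, py, pz)
        ht += math.sqrt(pt * pt + m * m)   # per-jet term of Eq. (2)
        mj_total += m                      # per-jet term of Eq. (3)
    return ht, mj_total

def toy_jet(pt, m):
    """Hypothetical jet at eta = 0 pointing along x, as (E, px, py, pz)."""
    return (math.sqrt(pt * pt + m * m), pt, 0.0, 0.0)

# Toy inputs (illustrative only): four shower-like jets versus four jets
# with large accidental mass, tuned so that both events have the same H_T.
qcd_like = [toy_jet(200.0, 20.0) for _ in range(4)]
signal_pt = math.sqrt(200.0**2 + 20.0**2 - 150.0**2)
signal_like = [toy_jet(signal_pt, 150.0) for _ in range(4)]

ht_qcd, mj_qcd = ht_and_total_jet_mass(qcd_like)
ht_sig, mj_sig = ht_and_total_jet_mass(signal_like)
print(f"QCD-like:    H_T = {ht_qcd:.1f} GeV, M_J = {mj_qcd:.1f} GeV")
print(f"signal-like: H_T = {ht_sig:.1f} GeV, M_J = {mj_sig:.1f} GeV")
```

In this toy setup a cut on $H_T$ cannot separate the two events, while a cut on $M_J$ can.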

The authors of an earlier study proposed taking advantage of total jet mass for high multiplicity
signals, but still relied on a missing energy cut. In this work, we demonstrate that accidental
substructure increases sensitivity when used in conjunction with total jet mass. This result is
especially useful in topologies with $\not{E}_T$ suppression, such as the benchmarks presented in Fig. 2.
Adding a moderate $\not{E}_T$ cut for other topologies that do contain sources of missing energy, e.g. new
physics signals with tops in the final state, can provide an additional handle for improving the
discriminating power of accidental substructure and jet mass [53].
FIG. 3: The $H_T$ (left) and $M_J$ (right) distributions for the backgrounds and an example signal. The signal
(red solid line) is pair production of a 750 GeV gluino with $\tilde{g} \to t\bar{t} + 3j$. The stacked histogram is for
background (QCD in solid blue, $W^\pm/Z^0 + 4j$ in hatched magenta, and $t\bar{t} + j$ in striped green). $M_J$ is a
more powerful discriminator than $H_T$ when comparing signal to background.

B. N-subjettiness

To quantify accidental substructure, we begin by considering the $N$-subjettiness variable $\tau_N$ [18, 19].
$\tau_N$ is a measure of the degree to which a fat jet has $N$ well-separated subjets. For each jet,
$\tau_N$ is defined as

$$\tau_N = \min_{\text{axes}} \frac{1}{d_\beta} \sum_i (p_T)_i \, \min\left\{ (\Delta R_{i,1})^\beta, \ldots, (\Delta R_{i,N})^\beta \right\}, \qquad d_\beta = \sum_i (p_T)_i R_0^\beta, \qquad (5)$$

where the minimization is performed by varying $N$ axes, $R_0$ is the choice of clustering radius, and
$\Delta R_{i,M} = \sqrt{(\Delta\phi_{i,M})^2 + (\Delta\eta_{i,M})^2}$ denotes the angular distance between the $i$th constituent particle
and the $M$th axis. We take $\beta = 1$ and $R_0 = 1.2$.

To elucidate what $N$-subjettiness measures, consider $\tau_3$. If the jet consists of three or fewer well-collimated
subjets, $\tau_3 \simeq 0$ because $\min\{\Delta R_{i,1}, \Delta R_{i,2}, \Delta R_{i,3}\}$ vanishes for the $i$th constituent. If the
fat jet contains more than three subjets (or the particles making up the jet are not well-collimated),
$\tau_3 > 0$ because at least one subjet is not aligned with an axis.
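
The behavior of $\tau_3$ described above can be checked with a minimal Python sketch. It assumes the standard $N$-subjettiness sum with $\beta = 1$ and $R_0 = 1.2$, and it takes the candidate axes as an input rather than minimizing over them; the function names and the toy constituents are hypothetical.

```python
import math

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance in the eta-phi plane, with phi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)

def tau_n(constituents, axes, beta=1.0, r0=1.2):
    """N-subjettiness for FIXED axes (no minimization over axes is done here).

    constituents: list of (pt, eta, phi); axes: list of (eta, phi).
    """
    d_beta = sum(pt for pt, _, _ in constituents) * r0 ** beta
    total = 0.0
    for pt, eta, phi in constituents:
        total += pt * min(delta_r(eta, phi, a_eta, a_phi) ** beta
                          for a_eta, a_phi in axes)
    return total / d_beta

# Toy fat jet made of three perfectly collimated subjets (illustrative values).
constituents = [(100.0, 0.0, 0.0), (80.0, 0.5, 0.5), (60.0, -0.4, 0.6)]
three_axes = [(0.0, 0.0), (0.5, 0.5), (-0.4, 0.6)]
two_axes = three_axes[:2]

tau3 = tau_n(constituents, three_axes)  # every constituent sits on an axis
tau2 = tau_n(constituents, two_axes)    # third subjet is off-axis
print(f"tau_3 = {tau3:.3f}, tau_2 = {tau2:.3f}")
```

With three axes every constituent lies on an axis, so the sum vanishes; with only two axes the third subjet contributes and the result is positive.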
While the individual $\tau_N$ are not typically useful, ratios are [Hj. For example,

$$\tau_{NM} \equiv \tau_N / \tau_M \qquad (6)$$

is efficient at selecting $N$-subjetty events for $M < N$. For a jet with $N$ well-separated subjets, $\tau_M$
is large, $\tau_N$ is small, and therefore $\tau_{NM}$ is much less than 1. Rejecting events with $\tau_{NM} \sim 1$ selects
for jets that are more $N$-prong like.

Figure 4 shows the normalized distributions of $\tau_{43}$ for each of the four hardest jets for QCD and
the $\tilde{g} \to t\bar{t} + 3j$ topology. The jets in each event are ordered by decreasing $p_T$. The background
sample is peaked around $\tau_{43} \sim 0.7$-$0.8$. In contrast, the distribution for the signal is shifted to lower
values, with a tail that extends to lower $\tau_{43}$. These distributions reinforce the general conclusions
we drew from the lego plots in Fig. 1. Specifically, $\tau_{43}$ is shifted towards lower values for the signal
relative to the background, suggesting that signal jets typically look more four-subjetty than the
background jets.



C. Introducing Event-subjettiness 

$N$-subjettiness is useful for characterizing the number of subjets in a single jet. However, it
would be useful to have a variable that takes into account the relative abundance of jets with
substructure in an entire event. To this end, we introduce "event-subjettiness," $T_{NM}$, which is
defined as the geometric mean of the $\tau_{NM}$ for the four hardest jets in an event:

$$T_{NM} \equiv \left[ \prod_{j=1}^{4} (\tau_{NM})_j \right]^{1/4}. \qquad (7)$$



The more jets with substructure in an event, the more jets with a small $\tau_{NM}$, resulting in a smaller
value of $T_{NM}$. The geometric mean is less sensitive to the presence of a single high $\tau_{NM}$ in an event
than the arithmetic mean. In particular, the arithmetic (geometric) mean tends to result in slightly
larger $S/B$ ($S/\sqrt{B}$) than the geometric (arithmetic) mean. This leads to a mild improvement in
the reach when using the geometric mean. We also explored placing cuts on combinations of the
$\tau_{NM}$ for the one or two hardest jets; this does not lead to the same level of discriminating power
because the amount of substructure is not necessarily correlated with the hardness of a jet.
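
The definition of event-subjettiness as a geometric mean can be sketched in a few lines of Python. The function name and the per-jet $\tau_{43}$ values below are hypothetical and chosen only to show how one jet with little substructure affects the geometric and arithmetic means differently.

```python
def event_subjettiness(tau_nm_values):
    """Geometric mean of tau_NM over the four hardest jets, as in Eq. (7)."""
    if len(tau_nm_values) != 4:
        raise ValueError("expected tau_NM for exactly four jets")
    product = 1.0
    for tau_nm in tau_nm_values:
        product *= tau_nm
    return product ** 0.25

# Hypothetical per-jet tau_43 values: three jets with pronounced
# substructure and one jet with tau_43 close to 1.
tau43_per_jet = [0.3, 0.4, 0.5, 0.9]

t43_geometric = event_subjettiness(tau43_per_jet)
t43_arithmetic = sum(tau43_per_jet) / len(tau43_per_jet)
print(f"geometric mean  T_43 = {t43_geometric:.3f}")
print(f"arithmetic mean      = {t43_arithmetic:.3f}")
```

For these inputs the geometric mean is pulled up less by the single high value than the arithmetic mean is.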

Figure 5 illustrates the distributions of $T_{43}$ for backgrounds and the signal example with
$\tilde{g} \to t\bar{t} + 3j$. For this topology, many of the signal fat jets have four or more subjets, which drives
down $T_{43}$ relative to that for the backgrounds. This is evident, for example, in Fig. 1, where the
signal event has $T_{43} = 0.45$ compared to 0.73 for the QCD event. As Fig. 5 shows, after a cut on the
total jet mass (right panel), the ratio of signal to background improves relative to no total jet mass
cut (left panel). The right panel suggests that the signal and background can be distinguished by
applying an additional cut $T_{43} < 0.6$. We demonstrate the efficacy of this strategy in the following


FIG. 4: Normalized distributions of $\tau_{43}$ for background and a signal example. Each plot shows the normalized
distribution before a cut on $M_J$. The signal (red solid line) is pair production of a 750 GeV gluino with
$\tilde{g} \to t\bar{t} + 3j$. The solid blue histogram is for the QCD background. Each panel is the distribution for the
$j$th jet (Jet 1, Jet 2, and so on); the order is by decreasing $p_T$. Note that the top and electroweak backgrounds
are subdominant and are not shown here.
FIG. 5: Distributions of $T_{43}$ for backgrounds and an example signal, with $M_J > 0$ (left) and $M_J > 500$ GeV
(right). The signal (red solid line) is pair production of a 750 GeV gluino with $\tilde{g} \to t\bar{t} + 3j$. The stacked
histogram is for background (QCD in solid blue, $W^\pm/Z^0 + 4j$ in hatched magenta, and $t\bar{t} + j$ in striped
green). A cut on $T_{43} < 0.6$ helps to distinguish signal from background, after requiring $M_J > 500$ GeV.



section when we estimate the sensitivity to the signal topologies in Fig. 2.



III. ANALYSIS STRATEGY 



Having presented the individual components of our analysis, we now combine them and present 
the complete search strategy. To illustrate the effectiveness of this approach, we compute expected 
limits for the three different RPV gluino decay chains in Fig. 2. Of course, our proposal is quite
general and can be applied to any high-multiplicity final-state. 



A. Event Generation 

We begin by briefly describing the generation of signal and background events. Appendix A
contains a more detailed description of the detector mockup and Monte Carlo validation. 

QCD is the dominant background for a multijet signal with no missing energy. Sherpa 1.4.0 [54-58]
is used to generate and shower $\sim 400$ million inclusive $pp \to nj$ events, where $n \in \{2, \ldots, 6\}$.
Matrix elements for up to 6 partons are generated, which are then matched to the parton shower
using the CKKW procedure [59]. All Sherpa events are generated using the default CTEQ 6.6
parton distribution function [60] and include the effects of underlying event. We generated a
sample of weighted events in order to increase the statistical power of our finite sample. The
Monte Carlo error, $\epsilon_{\rm MC}$, after cuts is

$$\epsilon_{\rm MC} = \frac{\sqrt{\sum_i w_i^2}}{\sum_i w_i}, \qquad (8)$$

where $w_i$ is the weight of the $i$th event in the sample. We verify that the Monte Carlo error is less
than the systematic error for the signal regions of interest.
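
A common estimator for the relative statistical uncertainty of a weighted sample is $\sqrt{\sum_i w_i^2} / \sum_i w_i$. The sketch below assumes this estimator is the one intended for the Monte Carlo error; the function name and the example weights are hypothetical.

```python
import math

def mc_relative_error(weights):
    """Relative statistical error of a weighted yield: sqrt(sum w^2) / sum w.

    Assumption: this standard estimator is used for the Monte Carlo error.
    """
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("sum of weights must be positive")
    return math.sqrt(sum(w * w for w in weights)) / total

# 100 unit-weight events give the familiar 1/sqrt(N) behavior.
unweighted = [1.0] * 100
# Same number of events, but one large weight dominates the yield.
skewed = [1.0] * 99 + [50.0]

err_unweighted = mc_relative_error(unweighted)
err_skewed = mc_relative_error(skewed)
print(f"unit weights:  {err_unweighted:.3f}")
print(f"skewed weights: {err_skewed:.3f}")
```

A sample dominated by a few large weights has a much larger relative error than its raw event count suggests, which is why the error is compared against the systematic uncertainty after cuts.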

For consistency, Sherpa is also used to generate additional subleading background contributions.
In particular, we generate $\sim 25$ million matched and weighted $t\bar{t} + n_t j$ events, where the tops are
forced to decay hadronically. We also simulate $\sim 25$ million matched and weighted data sets for
each electroweak background: $W^+ + n_W j$, $W^- + n_W j$, and $Z^0 + n_Z j$, where the gauge bosons
are forced to decay to quarks. Here, $n_t \in \{0, 1\}$ and $n_W, n_Z \in \{1, 2, 3, 4\}$. Table I shows that these
non-QCD backgrounds are subdominant. This would not be the case if a $\not{E}_T$ cut were also applied.

The matrix elements for gluino pair production are generated in MadGraph5 1.4.8.4 [61] for the
$\tilde{g} \to t\bar{t} + 3j$ topology. Those for the $\tilde{g} \to t + 2j$ and $\tilde{g} \to 3j$ topologies are generated directly
in Pythia 8.170 [62-64], where the RPV gluino is allowed to hadronize before decaying. All three
signal topologies are generated using the default CTEQ6L1 PDF set [65, 66] and are showered and
hadronized in Pythia including the effects of underlying event. Because the gluinos are produced
at threshold and decay to several fairly hard jets, it is not necessary to perform matching.

Both signal and background events are passed through our own detector mockup, which only
includes the effects of detector granularity. FastJet 3.0 [67, 68] is used to cluster events into
anti-$k_T$ [39] jets with $R = 1.2$. Variables such as jet mass and substructure are sensitive to
soft, diffuse radiation that results from underlying event and pile-up. The ATLAS study in [55]
explicitly demonstrated that the mean jet mass for anti-$k_T$ jets with $R = 1.0$ and $p_T > 300$ GeV
is constant with respect to the number of pile-up vertices for 35 pb$^{-1}$ of 7 TeV data, after a
splitting/filtering procedure is applied. For variable multiplicity fat jets, which is quite typical for
accidental substructure, filtering is not the optimal grooming technique because it places a fixed
requirement on the number of subjets within the fat jet [4]. Instead, to reduce the contamination
due to soft radiation resulting from underlying event, we apply the trimming procedure of [11].




Requirement               QCD                  $t\bar{t}+j$   $W^\pm/Z^0+4j$       $\tilde{g} \to t\bar{t}+3j$   $\tilde{g} \to t+2j$   $\tilde{g} \to 3j$
(1) $N_j = 4$             $5.8 \times 10^6$    4500           $1.0 \times 10^4$    680                           7200                   4800
(2) $M_J > 500$ GeV       6800                 8.4            40                   400                           990                    640
(3) $T_{43} < 0.6$        180                  0.61           1.5                  75                            110                    (48)
    or $T_{21} < 0.2$     77                   0.047          1.1                  (1.7)                         (27)                   39

TABLE I: Event yields from our Monte Carlo simulation, assuming 5 fb$^{-1}$ of 8 TeV data and taking the
gluino mass to be 750 GeV for $\tilde{g} \to t\bar{t} + 3j$ and 500 GeV for the other two topologies. The table shows the
number of events after requiring (1) four fat jets with $m_j > 20$ GeV and the appropriate $p_T$ requirements
(see text), then (2) a cut on the total jet mass, and then (3) a cut on event-subjettiness for a given choice of
$T_{NM}$. Yields are shown for two different $T_{NM}$ cuts that are optimized for the 18, 10, and 6-parton topologies;
the signal yield that corresponds to the best choice for this cut is shown without parentheses, while the
non-optimal choices are in parentheses.

We require any subjets of radius $R = 0.3$ to have a $p_T$ greater than 5% of the fat jet's transverse
momentum. This choice of parameters is motivated by a recent ATLAS analysis [32]. We find
that trimming eliminates the dependence on the different underlying event models used by the
generators.
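
The trimming requirement can be illustrated with a deliberately simplified Python sketch. It assumes the $R = 0.3$ subjets have already been found and only applies the 5% $p_T$ threshold; the reclustering step and the recomputation of the trimmed jet four-vector are not shown. The function name and the toy subjet values are hypothetical.

```python
def trim_subjets(subjet_pts, fat_jet_pt, f_cut=0.05):
    """Keep subjets whose p_T exceeds f_cut times the fat jet p_T.

    Simplification: subjets are given as a list of p_T values (GeV);
    finding them by reclustering the fat jet is assumed to be done elsewhere.
    """
    threshold = f_cut * fat_jet_pt
    return [pt for pt in subjet_pts if pt > threshold]

# Toy fat jet with p_T = 190 GeV (illustrative): two hard subjets plus
# two soft ones of the kind produced by underlying event or pile-up.
subjet_pts = [120.0, 60.0, 8.0, 3.0]
kept = trim_subjets(subjet_pts, fat_jet_pt=190.0)
print(kept)
```

Here the threshold is 9.5 GeV, so the two soft subjets are discarded and only the hard ones contribute to the trimmed jet mass and substructure.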

Prospino 2.1 [69] is used to obtain the NLO production cross section for the gluinos. For
the QCD background, we use a $K$-factor of 1.8, obtained by comparing distributions of the
generated QCD Monte Carlo with published distributions in [331 CO] (see Appendix A for details
on validation). All other backgrounds are subdominant and our analysis is therefore insensitive
to the exact choice of their cross sections. We use the Sherpa leading order predictions for these
backgrounds.

B. Expected Reach 

Now, we are ready to compute the expected reach of our analysis. All events are required
to satisfy the following criteria. Each event must have at least four fat jets, where the $p_T$ of
the hardest jet is at least 100 GeV and the $p_T$ of the next three hardest jets is at least 50
GeV. To reduce contamination of heavy flavor resonances and high-$p_T$ QCD jets with no hard
splittings, only jets with $m_j > 20$ GeV are considered. To further reduce QCD and $t\bar{t}$ background
contributions, each event must have at least 500 GeV of total jet mass, $M_J$. Finally, a cut is placed
on event-subjettiness, $T_{NM}$. The cuts for $M_J$ and $T_{NM}$ were selected to maximize significance,
while ensuring that the Monte Carlo error remained below the systematic error. This requirement
imposes a significant limitation on our ability to fully optimize the search and is the reason we
only present one set of cuts. Table I summarizes the cut efficiencies on signal and background.

To determine the expected reach for the three topologies in Fig. 2, we assume that the probability
of measuring $n$ events is given by the Poisson distribution with mean $\mu = B + S$, where $B$ and $S$ are
the number of expected background and signal events, respectively. The probability of measuring
up to $N_m$ events is

$$P(N_m | \mu) = e^{-\mu} \sum_{n=0}^{N_m} \frac{\mu^n}{n!}. \qquad (9)$$

This expression assumes that there is no uncertainty in the value for $B$. In the presence of a
systematic uncertainty $\epsilon_{\rm sys}$, Eq. (9) must be convoluted with the probability distribution of $B$,
which we assume is log-normal because $B > 0$:

$$P_{\rm sys}(N_m, S, B) = \int_0^\infty dx \, P(N_m | S + x) \, \ln\mathcal{N}(x), \qquad (10)$$

where

$$\ln\mathcal{N}(x) = \frac{1}{x \sqrt{2\pi} \, \epsilon_{\rm sys}} \exp\left[ -\frac{(\ln x - \ln B)^2}{2 \epsilon_{\rm sys}^2} \right].$$

Note that as $\epsilon_{\rm sys} \to 0$, the log-normal distribution
becomes a delta function centered at $B$ and Eq. (10) reproduces the standard result for a Poisson
distribution. To obtain the expected limit on the signal cross section, we solve Eq. (10) for $S$
assuming that $N_m = B$ and $P_{\rm sys} = 0.05$ (95% exclusion). We find that the expected limits are not
sensitive to the distribution function chosen for $B$; a Gaussian distribution gives essentially the
same result.
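
The limit-setting procedure of Eqs. (9) and (10) can be sketched numerically in pure Python. This is a minimal illustration under stated assumptions, not the code used for the results: the integral is evaluated with a trapezoid rule after the substitution $x = B e^{\epsilon_{\rm sys} z}$ (which turns the log-normal measure into a unit Gaussian in $z$), $N_m$ is taken as $B$ rounded to an integer, and the solution for $S$ is found by bisection. All function names, step counts, and the example inputs are hypothetical.

```python
import math

def poisson_cdf(n_max, mu):
    """Eq. (9): probability of observing at most n_max events for mean mu."""
    total = 0.0
    for n in range(n_max + 1):
        total += math.exp(-mu + n * math.log(mu) - math.lgamma(n + 1))
    return min(total, 1.0)

def p_sys(n_max, s, b, eps_sys, n_steps=120, z_max=6.0):
    """Eq. (10) via x = b * exp(eps_sys * z), with z a unit Gaussian."""
    h = 2.0 * z_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        z = -z_max + k * h
        weight = 0.5 * h if k in (0, n_steps) else h   # trapezoid rule
        gauss = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        total += weight * gauss * poisson_cdf(n_max, s + b * math.exp(eps_sys * z))
    return total

def expected_signal_limit(b, eps_sys, p_excl=0.05, n_iter=40):
    """Solve P_sys(N_m = round(B), S, B) = p_excl for S by bisection."""
    n_max = int(round(b))
    lo, hi = 0.0, 10.0 * b + 50.0   # assumed bracket: P_sys falls with S
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if p_sys(n_max, mid, b, eps_sys) > p_excl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative inputs: B = 20 events with a negligible and a 20% systematic.
s_no_sys = expected_signal_limit(20.0, 0.01)
s_with_sys = expected_signal_limit(20.0, 0.20)
print(f"S excluded at 95% (eps_sys ~ 0):  {s_no_sys:.2f}")
print(f"S excluded at 95% (eps_sys = 0.2): {s_with_sys:.2f}")
```

Dividing the excluded $S$ by the signal efficiency and the integrated luminosity would then give the corresponding cross-section limit. The larger systematic uncertainty weakens the limit, that is, a larger $S$ is needed for exclusion.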

An ATLAS analysis of the full 2011 dataset reported a jet mass scale systematic uncertainty of
$\sim 4$-$8$% (depending on jet $p_T$) for anti-$k_T$ trimmed jets with $R = 1.0$ [32]. For four fat jets, this
gives at most a 16% systematic uncertainty. In order to be conservative, we take $\epsilon_{\rm sys} = 20$% when
computing sensitivities.

We begin by considering gluino pair production with $\tilde{g} \to t\bar{t} + 3j$. This topology can yield up
to 18 partons when the tops decay hadronically. For this final state, the $T_{43}$ event-subjettiness
variable is most effective. For a 750 GeV gluino, a cut of $T_{43} < 0.6$ increases $S/B$ from 0.06 to
0.42, and $S/\sqrt{B}$ from 4.9 to 5.6, as seen in Table I. Figure 6 shows the expected reach for 5 fb$^{-1}$



[Figure 6: x-axis is Gluino Mass (GeV)]

FIG. 6: The 95% expected exclusion curves for the $\tilde{g} \to t\bar{t} + 3j$ topology at the 8 TeV LHC with 5 fb$^{-1}$
of data. The solid grey curve is the NLO prediction for the gluino pair production cross section computed
using Prospino, the dashed red curve is the expected exclusion including all cuts except the one on
event-subjettiness, and the solid red curve is the exclusion when $T_{43} < 0.6$ is imposed. A systematic error
$\epsilon_{\rm sys} = 20$% is assumed for the background prediction. Cutting on event-subjettiness improves the reach by
$\sim 350$ GeV.

of 8 TeV data. The gray line is the NLO gluino pair-production cross section, as evaluated by 
Prospino. The dashed red line shows the expected limit when all cuts are applied, except that on 
event-subjettiness. With the additional cut on T43, the expected limit improves by ~ 350 GeV, as 
illustrated by the solid red line. Requiring jets with accidental substructure significantly extends 
the reach beyond a search that relies on total jet mass alone. 
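
The figures of merit quoted for the 750 GeV $\tilde{g} \to t\bar{t} + 3j$ benchmark can be cross-checked against the rounded yields of Table I with a short Python calculation. The dictionary layout and function names are choices made for this sketch; because the table entries are rounded, the results agree with the quoted values only approximately.

```python
import math

# Rounded yields from Table I (5 fb^-1 at 8 TeV).
background_after_mj = {"QCD": 6800.0, "ttbar+j": 8.4, "W/Z+4j": 40.0}
background_after_t43 = {"QCD": 180.0, "ttbar+j": 0.61, "W/Z+4j": 1.5}
signal_after_mj = 400.0    # 750 GeV gluino, M_J > 500 GeV
signal_after_t43 = 75.0    # additionally T_43 < 0.6

def s_over_b(signal, backgrounds):
    """Signal-to-background ratio using the summed background yield."""
    return signal / sum(backgrounds.values())

def s_over_sqrt_b(signal, backgrounds):
    """Simple statistical significance estimate S / sqrt(B)."""
    return signal / math.sqrt(sum(backgrounds.values()))

sb_before = s_over_b(signal_after_mj, background_after_mj)
sb_after = s_over_b(signal_after_t43, background_after_t43)
sig_before = s_over_sqrt_b(signal_after_mj, background_after_mj)
sig_after = s_over_sqrt_b(signal_after_t43, background_after_t43)

print(f"S/B:       {sb_before:.3f} -> {sb_after:.3f}")
print(f"S/sqrt(B): {sig_before:.2f} -> {sig_after:.2f}")
```

The gain from the event-subjettiness cut shows up mainly in $S/B$, which matters most when the background systematic uncertainty dominates.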

Event-subjettiness is an effective variable for other RPV gluino decay chains. However, as the
number of hard partons decreases, the signature of accidental substructure becomes more subtle.
Consider the middle diagram of Fig. 2, where $\tilde{g} \to t + 2j$. The 8 TeV, 5 fb$^{-1}$ expected limits on this
final state are extended from 400 GeV to 600 GeV when $T_{43} < 0.6$ is required in addition to a jet
mass cut. For a 500 GeV gluino, cutting on substructure improves the signal to background ratio
from 0.14 to 0.61, as seen in Table I. Due to the smaller number of partons, the improvement in
significance is not as dramatic as for the $\tilde{g} \to t\bar{t} + 3j$ topology described previously. Here, the main
advantage of cutting on substructure is to increase $S/B$. This provides a significant improvement
because systematic uncertainties tend to drive the sensitivity in the signal region when QCD is the
dominant background.

Lastly, we consider the 6-parton topology illustrated in the right-most diagram of Fig. 2. Of
the three decay modes studied in this work, this has the fewest partons and is therefore the most
challenging to observe. In particular, $T_{21}$ provides the best discriminating power for this topology.
The left panel of Fig. 7 shows the $T_{21}$ distribution for background and signal after applying a
$M_J > 500$ GeV cut. The background is peaked between 0.35-0.4 and the signal is peaked at
0.25-0.35. The right panel of Fig. 7 shows the expected exclusion for the 6-parton final state,
assuming 5 fb$^{-1}$ of 8 TeV data. The dashed red line shows that the expected limit is $\sim 350$ GeV
[Figure 7, right panel: legend entries "$M_J > 500$", "$M_J > 500$, $T_{21} < 0.2$", and "ATLAS 6j"; x-axis is Gluino Mass (GeV)]

FIG. 7: The $T_{21}$ distribution for signal and background after requiring $M_J > 500$ GeV (left) and 95%
expected exclusion (right) for the $\tilde{g} \to 3j$ topology at the 8 TeV LHC with 5 fb$^{-1}$ of data.
Left: The signal (red solid line) is pair production of a 500 GeV gluino with $\tilde{g} \to 3j$. The stacked histogram
is for background (QCD in solid blue, $W^\pm/Z^0 + 4j$ in hatched magenta, and $t\bar{t} + j$ in striped green). A cut
on $T_{21} < 0.2$ effectively distinguishes signal from background, after requiring $M_J > 500$ GeV.
Right: The solid grey curve is the NLO prediction for the gluino pair production cross section computed
using Prospino, the dashed red curve is the expected exclusion including all cuts except the one on
event-subjettiness, and the solid red curve is the exclusion when $T_{21} < 0.2$ is imposed. For comparison, the
green dotted line shows our reproduction of the ATLAS search for this same topology [37]. Our analysis is
competitive with the ATLAS reach. A systematic error $\epsilon_{\rm sys} = 20$% is assumed for the background prediction.
A cut on event-subjettiness improves the reach by $\sim 250$ GeV.
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before a cut on event-subjettiness. The expected limit increases to ~ 600 GeV when T21 < 0.2 is 
required (the solid red line). As in the last example, the improvement in the limit arises from an 
increase in the ratio of signal to background after substructure cuts. 
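
The two-cut selection for this topology can be sketched in a few lines. The sketch assumes that M_J is the summed mass of the four hardest fat jets and that event-subjettiness T_21 is the geometric mean of the per-jet ratios tau_2/tau_1 over those jets; both are assumptions here, so the definitions given earlier in the paper take precedence. The `FatJet` container and all function names are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Sequence


@dataclass
class FatJet:
    """Hypothetical container for one large-radius jet."""
    mass: float  # jet mass in GeV
    tau1: float  # 1-subjettiness
    tau2: float  # 2-subjettiness


def total_jet_mass(jets: Sequence[FatJet]) -> float:
    """M_J: summed mass of the fat jets (assumed definition)."""
    return sum(j.mass for j in jets)


def event_subjettiness_21(jets: Sequence[FatJet]) -> float:
    """T_21: geometric mean of tau_2/tau_1 over the jets (assumed definition)."""
    if not jets or any(j.tau1 <= 0.0 for j in jets):
        raise ValueError("need at least one jet, each with tau1 > 0")
    ratios = [j.tau2 / j.tau1 for j in jets]
    return prod(ratios) ** (1.0 / len(ratios))


def passes_selection(jets: Sequence[FatJet],
                     mj_min: float = 500.0,
                     t21_max: float = 0.2) -> bool:
    """Apply the M_J > 500 GeV and T_21 < 0.2 cuts quoted in the text."""
    return total_jet_mass(jets) > mj_min and event_subjettiness_21(jets) < t21_max
```

Because the geometric mean is small only when every jet has a small ratio, a single jet without clear two-prong structure pulls T_21 up and can fail the event.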

The expected reach of our substructure analysis for RPV gluinos is ~ 600 GeV and compares
favorably with published limits from CMS and ATLAS. The CMS search for three-jet invariant
mass resonances @U] excludes an RPV gluino from 280-460 GeV with 5 fb⁻¹ of 7 TeV data. The
ATLAS analysis for this final state, published with 4.6 fb⁻¹ of 7 TeV data, uses two techniques to
provide exclusions [37]. They perform a boosted gluino analysis that makes use of jet substructure
and can exclude the gluino in the range 100-255 GeV.³ A separate "resolved" analysis uses the
p_T of the sixth jet (anti-k_T, R = 0.4) to separate signal from background, and excludes the gluino
from 100-666 GeV.

To provide a direct comparison, we reproduce the ATLAS resolved analysis by reclustering
our background and signal into anti-k_T jets with R = 0.4 and applying the cuts from [37]. The
projected limit for 5 fb⁻¹ of 8 TeV data is shown by the green dotted line of Fig. 7 and gives a
limit of about 550 GeV.⁴ This demonstrates that our projected limit, which relies on accidental
substructure, is competitive with that from the ATLAS resolved analysis.

To emphasize the effectiveness of our approach, we also performed a naive comparison between
our method and the ATLAS resolved jet analysis of [37] as applied to the g̃ → tt̄ + 3j topology.
The ATLAS search is not optimized for this signal; in particular, for this topology relying on
b-jets and/or leptons may be a more effective strategy. However, it provides a rough guide for a
small-radius jet (with R ~ 0.4) analysis that one might consider when searching for this multitop
topology. We find that there is no bound on the gluino mass for the 6-jet cuts proposed in [37].
In principle, the signal region could be extended to a larger jet count. In that case, however,
background estimation can be quite challenging. On the other hand, the accidental substructure
analysis outlined in this paper is broadly applicable to signals with different jet multiplicities.

³ The recent theory work in [38] finds that the limit on boosted RPV gluinos can be increased by searching for a
peak in the jet mass spectrum.

⁴ Note that our expected limit of 550 GeV is weaker than that in [37], although it does fall at the edge of the
published 1-sigma uncertainty. We can reproduce their limit if we take a K-factor of 1.0 for the QCD background.
For consistency with the validation plots from Appendix A, we use the more conservative 1.8 K-factor for Fig. 7.

IV. CONCLUSION

In this paper, we introduced the concept of accidental substructure and illustrated its usefulness
in searches for high-multiplicity final states with no missing energy. Accidental substructure arises
because there is a high likelihood that several final-state partons will be clustered together in the
same large-radius jet. These final-state partons need not have originated from the same parent
particle. QCD is the dominant background. Having several partons in a QCD event that undergo
a large-angle, hard splitting is rare enough to make accidental substructure a useful discriminator.

We analyzed three RPV gluino decay topologies with as many as 18, 10, and 6 partons in the
final state. The requirement that the total jet mass be greater than 500 GeV, in conjunction with a
cut on event-subjettiness, proved to be very effective. We found projected limits of O(800 GeV) for
the g̃ → tt̄ + 3j topology, O(600 GeV) for the g̃ → t + 2j topology, and O(600 GeV) for the g̃ → 3j
final state with 5 fb⁻¹ of 8 TeV data. These projections assume a 20% systematic uncertainty and
a conservative K-factor for the normalization of the QCD background. Our goal was to illustrate
the general applicability of a search using accidental substructure, and we expect that many aspects
of this analysis can be further optimized. One possibility, for instance, is to use a neural network to
select the appropriate N-subjettiness variables to include in the evaluation of event-subjettiness.
Also, we have not explored how the sensitivity of the search depends on jet radius.

In the case of the 6-parton final state from RPV gluino decays, our expected limit is comparable 
to that set by the ATLAS small-radius jet analysis [37]. Determining the normalization of the QCD 
background for a 6 (or more) small-radius jet signal is challenging. As a result, it is important to 
have a complementary search with independent systematics. Our accidental substructure search is 
one possible example and is, in addition, sensitive to a broader array of signals than the ATLAS 
search. In particular, its sensitivity only improves as the number of final-state partons increases,
as we showed for the 10- and 18-parton final states.

Events with many tops can lead to many jets in the final state (the scenario we consider here),
but other decay channels can give leptons and missing E_T. Analyses that tag on a lepton and several b-jets
can be sensitive in these cases [71]. We also expect our reach to improve significantly when b-tags
are included [72]. Alternatively, the total energy S_T may be useful; while it provides the greatest
discriminating power in black hole searches [TSJ [73], the S_T cut must be above several TeV to
adequately reduce the multijet background. Tagging on a lepton in addition to six or more jets
could allow an S_T cut down to ~ 1 TeV [75].

The search we proposed here is complementary to these types of analyses. We expect that its
potential reach will only increase by adding additional handles. For example, we find that naive
cuts on jet mass and event-subjettiness lead to a limit on g̃ → tt̄ + missing E_T that is only slightly
weaker than the current bounds from CMS and ATLAS. Adding a lepton, a b-tag, and/or a small cut
on missing E_T could make the search even more powerful. We explore the potential of such analyses in
follow-up work [53].

A significant advantage of using fat jets to study final states with many partons is that it is 
compatible with data-driven determinations of the QCD background. Mapping out the phase
space of high-multiplicity QCD with Monte Carlo is currently not possible. For a fat jet analysis,
one can use a dijet sample to map out distributions of the internal structure of the jets and to 
obtain templates for jet mass and substructure as a function of the jet kinematics. Under the 
mild assumption that the correlations between fat jets are small, one only needs to predict the 
phase space distribution of the four fat jets, while the internal properties of each fat jet can be 
modeled using the template functions derived from the dijet events. This simple algorithm allows 
an extrapolation of the QCD contribution to the four fat jet signal region. 
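
The template procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the p_T binning, the choice of (mass, tau_21) as the templated quantities, the geometric-mean form of T_21, and all function names are assumptions. Each fat jet in a multijet event is "dressed" with a (mass, tau_21) pair drawn from the dijet template of its p_T bin, independently of the other jets, which encodes the assumption that correlations between fat jets are small.

```python
import bisect
import random
from collections import defaultdict
from math import prod

PT_EDGES = [200.0, 300.0, 400.0, 500.0, 600.0]  # hypothetical fat-jet pT bins (GeV)


def pt_bin(pt):
    """Index of the pT bin containing pt, or None if pt is out of range."""
    if not (PT_EDGES[0] <= pt < PT_EDGES[-1]):
        return None
    return bisect.bisect_right(PT_EDGES, pt) - 1


def build_templates(dijet_jets):
    """Collect (mass, tau21) pairs per pT bin from a dijet control sample.

    dijet_jets: iterable of (pt, mass, tau21), one entry per jet.
    """
    templates = defaultdict(list)
    for pt, mass, tau21 in dijet_jets:
        b = pt_bin(pt)
        if b is not None:
            templates[b].append((mass, tau21))
    return templates


def dress_event(jet_pts, templates, rng):
    """Draw (mass, tau21) for each fat jet independently from its pT-bin template."""
    dressed = []
    for pt in jet_pts:
        b = pt_bin(pt)
        if b is None or not templates.get(b):
            return None  # no template covers this jet
        dressed.append(rng.choice(templates[b]))
    return dressed


def predicted_pass_fraction(events, templates, mj_min, t21_max,
                            n_throws=100, seed=1):
    """Fraction of dressed pseudo-events passing M_J > mj_min and T_21 < t21_max.

    events: list of per-event fat-jet pT lists (the only multijet input needed).
    """
    rng = random.Random(seed)
    n_pass = n_total = 0
    for jet_pts in events:
        for _ in range(n_throws):
            dressed = dress_event(jet_pts, templates, rng)
            if dressed is None:
                break  # skip events outside the templated kinematic range
            n_total += 1
            m_j = sum(mass for mass, _ in dressed)
            t21 = prod(tau for _, tau in dressed) ** (1.0 / len(dressed))
            n_pass += (m_j > mj_min and t21 < t21_max)
    return n_pass / n_total if n_total else 0.0
```

Drawing mass and tau_21 together as a pair keeps the correlation between the two quantities inside a single jet, while the independent draws per jet discard any correlation between jets.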

The possibility of using a jet's internal structure to learn about its origin provides exciting 
opportunities for new physics searches at the LHC. Although jet substructure has only been used 
for boosted signals thus far, this work demonstrates that it is also applicable in the non-boosted 
regime. We have shown that accidental substructure provides a robust and powerful new paradigm 
for new physics searches at the LHC, complementing and extending the reach of current analyses.

Note Added

A related work will appear [76], which proposes a method of subjet counting and applies it to 
searches for high-multiplicity signals.

Appendix A: Simulation Details and Validation

In this appendix, we discuss the details of our simple detector mockup and provide validation
plots comparing our QCD Monte Carlo to a number of public distributions from ATLAS. We extract
a K-factor to normalize our QCD sample and demonstrate that our Monte Carlo reproduces the
measured shapes of substructure and jet mass distributions to reasonable accuracy.

We simulate detector granularity by clustering stable, visible generator-level particles into η × φ
cells of size 0.1 × 0.1. Electrons, muons, and photons are kept if they fall within |η| < 2.5, while
all other particles are kept if they fall within |η| < 3.0. Each calorimeter cell is assigned a
light-like vector with energy equal to the sum of all particle energies contained therein. FastJet 3.0
clusters these four-vectors into anti-k_T jets and computes N-subjettiness for the resulting jets
using the "min_axes" algorithm, implemented in the N-subjettiness plugin of Thaler and Van
Tilburg [14, 19]. Note that leptons are included in jet clustering and when calculating substructure
variables. A jet is removed if it is within ΔR < 0.2 of a lepton and its p_T is less than twice the
lepton's p_T.
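
The detector mockup described above can be sketched in plain Python. This is an illustrative sketch, not the code used for the paper: the text does not state where each cell's vector points, so the cell centre is assumed; the particle and jet tuples, the `kind` labels, and the function names are hypothetical. Jet clustering itself (FastJet) is not reproduced here.

```python
import math
from collections import defaultdict

CELL_SIZE = 0.1                                  # cell width in eta and in phi
CENTRAL_ONLY = {"electron", "muon", "photon"}    # kept only for |eta| < 2.5


def in_acceptance(kind, eta):
    """Apply the |eta| acceptance quoted in the text."""
    limit = 2.5 if kind in CENTRAL_ONLY else 3.0
    return abs(eta) < limit


def build_cells(particles):
    """Sum particle energies per cell and return light-like four-vectors.

    particles: iterable of (energy, eta, phi, kind).
    Returns {(i_eta, i_phi): (E, px, py, pz)}.
    """
    energy_sum = defaultdict(float)
    for energy, eta, phi, kind in particles:
        if not in_acceptance(kind, eta):
            continue
        i_eta = math.floor(eta / CELL_SIZE)
        i_phi = math.floor((phi % (2.0 * math.pi)) / CELL_SIZE)
        energy_sum[(i_eta, i_phi)] += energy

    cells = {}
    for (i_eta, i_phi), energy in energy_sum.items():
        # Assumption: the cell vector points at the cell centre.
        eta_c = (i_eta + 0.5) * CELL_SIZE
        phi_c = (i_phi + 0.5) * CELL_SIZE
        pt = energy / math.cosh(eta_c)           # massless: E = pT * cosh(eta)
        cells[(i_eta, i_phi)] = (energy,
                                 pt * math.cos(phi_c),
                                 pt * math.sin(phi_c),
                                 energy * math.tanh(eta_c))
    return cells


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance with the phi difference wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def remove_lepton_jets(jets, leptons):
    """Drop jets within dR < 0.2 of a lepton whose pT is below twice the lepton pT.

    jets, leptons: lists of (pt, eta, phi).
    """
    kept = []
    for jpt, jeta, jphi in jets:
        overlaps = any(delta_r(jeta, jphi, leta, lphi) < 0.2 and jpt < 2.0 * lpt
                       for lpt, leta, lphi in leptons)
        if not overlaps:
            kept.append((jpt, jeta, jphi))
    return kept
```

The four-vectors returned by `build_cells` would then be passed to a jet clustering library as input pseudo-particles.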

We validate our QCD Monte Carlo by comparing against published kinematic and substructure
distributions. No published 8 TeV substructure results are currently available, and so we compare
against the published 7 TeV ATLAS results [331 133 EO]. A weighted sample of pp → nj, where
n ∈ (2, ..., 6), is generated in Sherpa 1.4.0. Our 7 TeV sample consists of ~ 50 million events
and is generated with the same settings as our ~ 400 million event 8 TeV Sherpa sample, described
in Sec. III A.

To validate the shape of the jet mass and substructure distributions, we follow the analysis
in [33] and compare to the unfolded distributions. Particles are clustered into anti-k_T jets with
R = 1.0. The resulting jets are divided into four equally-spaced p_T bins from 200 to 600 GeV. The
jet mass (τ_21 and τ_32) distributions are shown in the top (bottom) of Fig. 8 for p_T ∈ (200, 300) GeV.
The Monte Carlo predictions are well within the error bands quoted by ATLAS. We checked that
the Sherpa results for the higher p_T bins, not shown here, also match the ATLAS results.

Sherpa outputs a leading order (matched) cross section of σ_QCD^Sherpa = 9.6 × 10^9 fb. Because this
cross section is enhanced by loop effects, we must find the proper normalization, or K-factor, for
the QCD background:

\[
\sigma_{\rm QCD} = K \times \sigma_{\rm QCD}^{\rm Sherpa} . \tag{A1}
\]

Using the reported 2-jet inclusive cross section in [70], we obtain a K-factor of ~ 1.3. Comparing
to the 6th jet p_T distribution in [37], we obtain a K-factor of 1.8. Furthermore, by comparing the
normalization of the jet mass, τ_21, and τ_32 distributions in [33], we obtain a K-factor of 1.8. To be
conservative, we assume a K-factor of 1.8 in this work.
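
As a worked illustration of Eq. (A1), the sketch below rescales the Sherpa cross section and converts a weighted sample into an expected event yield. The yield convention (cross section times luminosity times the passing fraction of the summed event weights) is a common one but is an assumption here, as are the function names; the luminosity is left as a free parameter.

```python
SIGMA_SHERPA_FB = 9.6e9   # leading-order (matched) Sherpa cross section, in fb
K_FACTOR = 1.8            # conservative K-factor adopted in the text


def normalized_cross_section(k, sigma_lo_fb):
    """Eq. (A1): sigma_QCD = K * sigma_QCD^Sherpa."""
    return k * sigma_lo_fb


def expected_yield(weights, passed, lumi_fb_inv,
                   k=K_FACTOR, sigma_lo_fb=SIGMA_SHERPA_FB):
    """Expected number of events passing a selection for a weighted sample.

    weights: per-event generator weights; passed: matching booleans.
    Assumed convention: N = K * sigma * L * (sum of passing weights / sum of all weights).
    """
    total_w = sum(weights)
    if total_w <= 0.0:
        raise ValueError("sum of weights must be positive")
    pass_w = sum(w for w, p in zip(weights, passed) if p)
    return normalized_cross_section(k, sigma_lo_fb) * lumi_fb_inv * pass_w / total_w


if __name__ == "__main__":
    sigma = normalized_cross_section(K_FACTOR, SIGMA_SHERPA_FB)
    print(f"sigma_QCD = {sigma:.3e} fb")
```

With K = 1.8 the normalized cross section is about 1.7 × 10^10 fb, compared with about 1.2 × 10^10 fb for K = 1.3, which is why the larger value is the more conservative choice for a background estimate.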



[Figure 8: three panels for 200 < p_T < 300 GeV comparing Sherpa QCD with ATLAS: jet mass (GeV), τ_21, and τ_32.]

FIG. 8: Jet mass (top) and N-subjettiness (bottom) comparisons between the Sherpa QCD prediction
[dotted red] and the ATLAS results [black rectangle] of [33]. The green band is the combined statistical and
systematic error in the ATLAS measurement including the uncertainty from the unfolding procedure.